23 research outputs found

    Principles for data analysis workflows

    Traditional data science education often omits training on research workflows: the process that moves a scientific investigation from raw data to coherent research question to insightful contribution. In this paper, we elaborate basic principles of a reproducible data analysis workflow by defining three phases: the Exploratory, Refinement, and Polishing Phases. Each workflow phase is roughly centered around the audience to whom research decisions, methodologies, and results are being immediately communicated. Importantly, each phase can also give rise to a number of research products beyond traditional academic publications. Where relevant, we draw analogies between principles for data-intensive research workflows and established practice in software development. The guidance provided here is not intended to be a strict rulebook; rather, the suggestions for practices and tools to advance reproducible, sound data-intensive analysis may furnish support for both students and current professionals

    A sister of PIN1 gene in tomato (Solanum lycopersicum) defines leaf and flower organ initiation patterns by maintaining epidermal auxin flux

    The spatiotemporal localization of the plant hormone auxin acts as a positional cue during early leaf and flower organogenesis. One of the main contributors to auxin localization is the auxin efflux carrier PIN-FORMED1 (PIN1). Phylogenetic analysis has revealed that PIN1 genes are split into two sister clades: PIN1 and the relatively uncharacterized Sister-Of-PIN1 (SoPIN1). In this paper we identify entire-2 as a loss-of-function SlSoPIN1a (Solyc10g078370) mutant in Solanum lycopersicum. The entire-2 plants are unable to specify proper leaf initiation, leading to a frequent switch from the wild type spiral phyllotactic pattern to distichous and decussate patterns. Leaves in entire-2 are large and less complex, and the leaflets display spatial deformities in lamina expansion, vascular development, and margin specification. During sympodial growth in entire-2 the specification of organ position and identity is greatly affected, resulting in variable branching patterns on the main sympodial and inflorescence axes. To understand how SlSoPIN1a functions in establishing proper auxin maxima we used the auxin signaling reporter DR5: Venus to visualize differences in auxin localization between entire-2 and wild type. DR5: Venus visualization shows a widening of auxin localization which spreads to subepidermal tissue layers during early leaf and flower organogenesis, showing that SoPIN1 functions to focus auxin signaling to the epidermal layer. The striking spatial deformities observed in entire-2 help provide a mechanistic framework for explaining the function of the SoPIN1 clade in S. lycopersicum.

    Morphological Plant Modeling: Unleashing Geometric and Topological Potential within the Plant Sciences

    The geometries and topologies of leaves, flowers, roots, shoots, and their arrangements have fascinated plant biologists and mathematicians alike. As such, plant morphology is inherently mathematical in that it describes plant form and architecture with geometrical and topological techniques. Gaining an understanding of how to modify plant morphology, through molecular biology and breeding, aided by a mathematical perspective, is critical to improving agriculture, and the monitoring of ecosystems is vital to modeling a future with fewer natural resources. In this white paper, we begin with an overview in quantifying the form of plants and mathematical models of patterning in plants. We then explore the fundamental challenges that remain unanswered concerning plant morphology, from the barriers preventing the prediction of phenotype from genotype to modeling the movement of leaves in air streams. We end with a discussion concerning the education of plant morphology synthesizing biological and mathematical approaches and ways to facilitate research advances through outreach, cross-disciplinary training, and open science. Unleashing the potential of geometric and topological approaches in the plant sciences promises to transform our understanding of both plants and mathematics

    Whole Genome Sequences of 23 Species from the Drosophila montium Species Group (Diptera: Drosophilidae): A Resource for Testing Evolutionary Hypotheses

    Large groups of species with well-defined phylogenies are excellent systems for testing evolutionary hypotheses. In this paper, we describe the creation of a comparative genomic resource consisting of 23 genomes from the species-rich Drosophila montium species group, 22 of which are presented here for the first time. The montium group is well-positioned for clade genomics. Within the montium clade, evolutionary distances are such that large numbers of sequences can be accurately aligned while also recovering strong signals of divergence; and the distance between the montium group and D. melanogaster is short enough so that orthologous sequence can be readily identified. All genomes were assembled from a single, small-insert library using MaSuRCA, before going through an extensive post-assembly pipeline. Estimated genome sizes within the montium group range from 155 Mb to 223 Mb (mean = 196 Mb). The absence of long-distance information during the assembly process resulted in fragmented assemblies, with the scaffold NG50s varying widely based on repeat content and sample heterozygosity (min = 18 kb, max = 390 kb, mean = 74 kb). The total scaffold length for most assemblies is also shorter than the estimated genome size, typically by 5–15%. However, subsequent analysis showed that our assemblies are highly complete. Despite large differences in contiguity, all assemblies contain at least 96% of known single-copy Dipteran genes (BUSCOs, n = 2,799). Similarly, by aligning our assemblies to the D. melanogaster genome and remapping coordinates for a large set of transcriptional enhancers (n = 3,457), we showed that each montium assembly contains orthologs for at least 91% of D. melanogaster enhancers. Importantly, the genic and enhancer contents of our assemblies are comparable to that of far more contiguous Drosophila assemblies. The alignment of our own D. serrata assembly to a previously published PacBio D. serrata assembly also showed that our longest scaffolds (up to 1 Mb) are free of large-scale misassemblies. Our genome assemblies are a valuable resource that can be used to further resolve the montium group phylogeny; study the evolution of protein-coding genes and cis-regulatory sequences; and determine the genetic basis of ecological and behavioral adaptations.

    Left–right leaf asymmetry in decussate and distichous phyllotactic systems

    Leaves in plants with spiral phyllotaxy exhibit directional asymmetries, such that all the leaves originating from a meristem of a particular chirality are similarly asymmetric relative to each other. Models of auxin flux capable of recapitulating spiral phyllotaxis predict handed auxin asymmetries in initiating leaf primordia with empirically verifiable effects on superficially bilaterally symmetric leaves. Here, we extend a similar analysis of leaf asymmetry to decussate and distichous phyllotaxy. We found that our simulation models of these two patterns predicted mirrored asymmetries in auxin distribution in leaf primordia pairs. To empirically verify the morphological consequences of asymmetric auxin distribution, we analysed the morphology of a tomato sister-of-pin-formed1a (sopin1a) mutant, entire-2, in which spiral phyllotaxy consistently transitions to a decussate state. Shifts in the displacement of leaflets on the left and right sides of entire-2 leaf pairs mirror each other, corroborating predicted model results. We then analyse the shape of more than 800 common ivy (Hedera helix) and more than 3000 grapevine (Vitis and Ampelopsis spp.) leaf pairs and find statistical enrichment of predicted mirrored asymmetries. Our results demonstrate that left-right auxin asymmetries in models of decussate and distichous phyllotaxy successfully predict mirrored asymmetric leaf morphologies in superficially symmetric leaves.